Impact of Missing Data on Phylogenies Inferred from Empirical Phylogenomic Data Sets
Similar Resources
Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data
Phylogenetic trees are used to represent the evolutionary relationships among various groups of species. In this paper, a novel method for inferring prokaryotic phylogenies from multiple sources of genomic information is proposed. The method is called CGCPhy and is based on the distance matrix of orthologous gene clusters between whole-genome pairs. CGCPhy comprises four main steps. First, orthologous genes ...
How should species phylogenies be inferred from sequence data?
levels as follows: (1) Separate gene trees are inferred from each linkage partition, and (2) the species phylogeny is then inferred from the set of gene trees. A method (Maddison, 1997; Page and Charleston, 1997a, 1997b; Slowinski et al., 1997) termed gene tree parsimony by Slowinski et al. (1997) is the appropriate method for implementing the second step. Gene tree parsimony operates by findin...
Impact of the Partitioning Scheme on Divergence Times Inferred from Mammalian Genomic Data Sets
Data partitioning has long been regarded as an important parameter for phylogenetic inference. The division of heterogeneous multigene data sets into partitions with similar substitution patterns is known to increase the performance of probabilistic phylogenetic methods. However, the effect of the partitioning scheme on divergence time estimates has generally been ignored. To investigate the im...
Learning from Data Sets with Missing Labels
This paper considers the task of learning discriminative classifiers of data when some class labels are missing from the data set (so-called “semi-supervised” learning), specifically when the labeled data are not drawn from the same distribution as the unlabeled data. This is an important issue in domains in which learning from only the labeled samples can result in a classifier that is not appro...
Missing data imputation in multivariable time series data
Multivariate time series data are found in a variety of fields such as bioinformatics, biology, genetics, astronomy, geography and finance. Many time series datasets contain missing data. Imputing missing data in multivariate time series is a challenging problem that needs to be carefully considered before learning from or predicting time series. Much research has been done on the use of diffe...
Journal
Journal title: Molecular Biology and Evolution
Year: 2012
ISSN: 1537-1719,0737-4038
DOI: 10.1093/molbev/mss208